AI in Web Development


Lesson 03

In [ ]:
import numpy as np
import skimage
import matplotlib.pyplot as plt
import cv2

What are spatial or "local" relationships?¶

image.png image-2.png image-3.png

What kinds of problems can be solved using CNNs?¶

CNNs excel on unstructured data such as:

  • Audio
  • Text
  • Images
  • Videos

image.png

In [ ]:
from IPython.lib.display import YouTubeVideo
YouTubeVideo("_1MHGUC_BzQ")

The Basics of Convolution¶

Yellow - kernel (called weights or filters)

Green - image

Pink - output of convolution, called an activation or feature map

When a CNN is trained, the kernel values are learned: backpropagation computes how each filter weight affects the loss, and the weights are updated step by step to reduce it.

image

Let's see it in action¶

http://setosa.io/ev/image-kernels/

In [ ]:
# create a 2D matrix and a 3 by 3 kernel (2d matrix as well)
In [ ]:
# write a function to do 2D convolutions using numpy
In [ ]:
# run function on matrix and kernel with a stride of 1
In [ ]:
# run function on matrix and kernel with a stride of 3
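
One possible way to fill in the four cells above is sketched here. The function name `conv2d`, the 6 x 6 example matrix, and the all-ones kernel are illustrative choices, not part of the lesson. Like most deep learning libraries, the sketch does not flip the kernel, so strictly speaking it computes a cross-correlation, and it uses no padding.

```python
import numpy as np

def conv2d(matrix, kernel, stride=1):
    """Slide `kernel` over `matrix` (no padding) and sum the elementwise products."""
    k_h, k_w = kernel.shape
    out_h = (matrix.shape[0] - k_h) // stride + 1
    out_w = (matrix.shape[1] - k_w) // stride + 1
    output = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = matrix[top:top + k_h, left:left + k_w]
            output[i, j] = np.sum(window * kernel)
    return output

# Hypothetical example data: a 6 x 6 input and a 3 x 3 kernel of ones,
# so each output value is simply the sum of the window it covers.
matrix = np.arange(36).reshape(6, 6)
kernel = np.ones((3, 3))

out_stride_1 = conv2d(matrix, kernel, stride=1)  # 4 x 4 output
out_stride_3 = conv2d(matrix, kernel, stride=3)  # 2 x 2 output
print(out_stride_1.shape, out_stride_3.shape)
```

A larger stride visits fewer window positions, which is why the stride of 3 produces a much smaller output than the stride of 1.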

What is an image?¶

An image is technically 3D. When loaded as a NumPy array, its shape is (height, width, number of channels). A typical image is in RGB format: 3 channels representing the red, green, and blue values of each pixel.

image.png

In [ ]:
image_file = "https://i.kym-cdn.com/entries/icons/mobile/000/013/564/doge.jpg"
# load file using skimage into numpy array
In [ ]:
# display image using plt
In [ ]:
# print image shape
In [ ]:
# print each channel of the image
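
The shape and channel cells can be sketched without downloading anything by using a small synthetic array as a stand-in for the loaded picture. The 4 x 6 size and the pixel values are made up for illustration; the assumption is that loading `image_file` with skimage would give an array with the same (height, width, channels) layout, only larger.

```python
import numpy as np

# Synthetic stand-in for a loaded RGB image:
# 4 rows (height), 6 columns (width), 3 channels.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[..., 0] = 255   # red channel fully on
image[..., 2] = 128   # blue channel at roughly half intensity

height, width, channels = image.shape
print(image.shape)

# Each channel is a 2D array with the same height and width as the image.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]
for name, channel in zip(("red", "green", "blue"), (red, green, blue)):
    print(name, channel.shape, channel.min(), channel.max())
```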

How do we convolve images?¶

  • Most images are in color and have 3 channels (red, green, and blue)
  • The filters in the first layer must therefore also be 3 channels deep
  • For a 3 x 3 filter, one convolution step is now the dot product of 27 values (3 x 3 x 3)

image.png
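
The 27-value dot product can be checked with a short sketch. The random patch and kernel are placeholders for a real image window and a learned filter.

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.random((3, 3, 3))    # hypothetical 3 x 3 window of an RGB image
kernel = rng.random((3, 3, 3))   # one filter, 3 channels deep

# One convolution step: multiply matching entries and add up all 27 products.
value = np.sum(patch * kernel)

# The same number written as an explicit dot product of two 27-element vectors.
same_value = np.dot(patch.ravel(), kernel.ravel())
print(patch.size, value, same_value)
```

Note that the three channels collapse into a single number, so one filter produces one 2D feature map regardless of how many channels the input has.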

Extracting Many Features¶

image.png

In [ ]:
# use a top sobel kernel and apply it to the image we loaded
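
A minimal sketch of this cell, assuming the "top sobel" kernel as shown on the image-kernels page linked earlier. To keep the example self-contained it runs on a tiny synthetic grayscale image (bright top half, dark bottom half) instead of the downloaded picture; for the real image, one option would be to convert it to grayscale first or to apply the kernel to each channel separately.

```python
import numpy as np

def conv2d(image, kernel):
    """Stride-1 sliding-window sum of products, no padding."""
    k_h, k_w = kernel.shape
    out_h = image.shape[0] - k_h + 1
    out_w = image.shape[1] - k_w + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + k_h, j:j + k_w] * kernel)
    return out

# Top sobel kernel: responds where pixels above are brighter than pixels below.
top_sobel = np.array([[ 1,  2,  1],
                      [ 0,  0,  0],
                      [-1, -2, -1]])

# Hypothetical 6 x 6 grayscale image with a horizontal edge in the middle.
gray = np.zeros((6, 6))
gray[:3, :] = 1.0

edges = conv2d(gray, top_sobel)
print(edges)
```

The output is zero in the flat regions and positive only in the rows whose window straddles the edge.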

What do the filters learn?¶

image.png

image.png

image.png

image.png

Properties of Kernels¶

Stride: the number of pixels the filter moves between consecutive positions. We have shown a stride of 1 so far.

Stride of 1: image.png

Stride of 2: image.png

Padding: adding pixels around the edges of the image so the filter fits properly as it is convolved across.

  • Zero padding: pad the edges with zeros
  • Valid padding: no padding; edge pixels that the filter cannot fully cover are dropped
  • Reflective padding: pad the edges with a mirror image of the pixels next to the border

image.png

image.png
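
How stride and padding change the output size can be sketched with the usual formula, floor((n + 2p - k) / s) + 1 along each axis. The helper name `output_size` and the 3 x 3 example matrix are illustrative; `np.pad` with `mode="constant"` and `mode="reflect"` is used here as one way to get zero and reflective padding.

```python
import numpy as np

def output_size(n, kernel, stride=1, padding=0):
    """Number of filter positions along one axis of length n."""
    return (n + 2 * padding - kernel) // stride + 1

print(output_size(7, 3, stride=1))             # valid padding, stride 1
print(output_size(7, 3, stride=2))             # valid padding, stride 2
print(output_size(7, 3, stride=1, padding=1))  # padding of 1 keeps the size

matrix = np.arange(1, 10).reshape(3, 3)

# Zero padding: surround the matrix with a border of zeros.
zero_padded = np.pad(matrix, pad_width=1, mode="constant", constant_values=0)

# Reflective padding: mirror the values next to the border.
reflect_padded = np.pad(matrix, pad_width=1, mode="reflect")

print(zero_padded)
print(reflect_padded)
```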

Activation functions for Convolutional Layers¶

In practice, ReLU is the most common default activation for convolutional layers and tends to perform well on image tasks. There is much research on why this is the case, but for now keep it in mind when working with CNNs.

Intuitively, ReLU sets negative activations to zero, so the network ignores them, while positive activations pass through unchanged as the features the model should focus on.
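
As a small illustration, ReLU can be written in one line of NumPy. The example feature map values are made up.

```python
import numpy as np

def relu(x):
    """Replace negative values with 0 and leave positive values unchanged."""
    return np.maximum(0, x)

feature_map = np.array([[-2.0, 0.5],
                        [ 3.0, -0.1]])
activated = relu(feature_map)
print(activated)
```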

The Magic of Pooling¶

Pooling downsamples its input, which reduces the size of the feature maps and helps the model generalize feature extraction across small variations in position, orientation, and scale.

Intuitively, max pooling picks the strongest feature from each window.

  • Max pooling
  • Average pooling
  • Global pooling

image.png

In [ ]:
# create a matrix and apply 2d max pooling to the matrix using numpy
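
One possible version of this cell is sketched here. The function name `max_pool2d`, the 2 x 2 window, and the example matrix are illustrative choices.

```python
import numpy as np

def max_pool2d(matrix, size=2, stride=2):
    """Keep the largest value from each size x size window."""
    out_h = (matrix.shape[0] - size) // stride + 1
    out_w = (matrix.shape[1] - size) // stride + 1
    output = np.zeros((out_h, out_w), dtype=matrix.dtype)
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            window = matrix[top:top + size, left:left + size]
            output[i, j] = window.max()
    return output

matrix = np.array([[1, 3, 2, 4],
                   [5, 6, 1, 2],
                   [7, 2, 9, 0],
                   [1, 8, 3, 4]])

pooled = max_pool2d(matrix)
print(pooled)
```

Swapping `window.max()` for `window.mean()` (with a float output array) would give average pooling instead.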

Preventing overfitting and reducing training time¶

Dropout: we have seen it before

Batch Normalization: normalizing the activations (feature maps) after a CNN layer.

  • Meaning each channel of the output has a mean of about 0 and a standard deviation of about 1 across the batch

image.png
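
The normalization step alone can be sketched in NumPy as follows. This is a simplified illustration, assuming activations laid out as (batch, height, width, channels); it leaves out the learned scale and shift parameters and the running statistics that real batch normalization layers also keep.

```python
import numpy as np

def batch_norm(activations, eps=1e-5):
    """Normalize each channel over the batch, height, and width axes."""
    mean = activations.mean(axis=(0, 1, 2), keepdims=True)
    var = activations.var(axis=(0, 1, 2), keepdims=True)
    return (activations - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
# Hypothetical batch of 8 feature maps, 4 x 4 pixels, 2 channels.
activations = rng.normal(loc=5.0, scale=3.0, size=(8, 4, 4, 2))

normalized = batch_norm(activations)
print(normalized.mean(axis=(0, 1, 2)))  # close to 0 for each channel
print(normalized.std(axis=(0, 1, 2)))   # close to 1 for each channel
```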

How do we classify the feature maps?¶

  1. Flatten the 3D output into a 1D array
  2. Feed it into the fully connected layers we have used before
  3. Use the output layer and the labels to train

image.png
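
The first two steps can be sketched with plain NumPy. The feature map size, the number of classes, and the random weights are all hypothetical; in a trained network the weights would be learned from the labels.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical output of the last convolutional block: (height, width, channels).
feature_maps = rng.random((4, 4, 8))

# Step 1: flatten the 3D output into a 1D array of 4 * 4 * 8 = 128 values.
flat = feature_maps.reshape(-1)

# Step 2: one fully connected layer with random (untrained) weights.
n_classes = 3
weights = rng.normal(size=(flat.size, n_classes))
bias = np.zeros(n_classes)
logits = flat @ weights + bias

print(flat.shape, logits.shape)
```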

In [ ]:
# print the matrix we are working with again
In [ ]:
# apply global average pooling to the sample matrix
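
A minimal sketch of global average pooling, assuming feature maps laid out as (height, width, channels). The example values are made up. For a single 2D matrix the same idea reduces to `matrix.mean()`.

```python
import numpy as np

# Hypothetical feature maps: 2 x 2 pixels, 3 channels.
feature_maps = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)

# Average over the height and width axes, leaving one value per channel.
gap = feature_maps.mean(axis=(0, 1))
print(gap)
```

Each channel is reduced to a single number, so the result can go straight into a fully connected layer without flattening.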

Output activation functions¶

  • Softmax: creates a probability distribution where each value is positive and all values sum to 1
    • Best for single-label, multi-class classification
  • Sigmoid: each value is between 0 and 1, but the values do not need to sum to 1
    • Best for multi-label classification, where several classes can be present at once
In [ ]:
# create the softmax and sigmoid functions, then apply them to an example 1d array of random values
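
One way to write both functions is sketched here. Subtracting the maximum inside `softmax` is a common numerical-stability trick and does not change the result; the random input values are only an example.

```python
import numpy as np

def softmax(x):
    """Exponentiate and normalize so the outputs sum to 1."""
    shifted = x - np.max(x)   # avoids overflow for large inputs
    exps = np.exp(shifted)
    return exps / exps.sum()

def sigmoid(x):
    """Squash each value independently into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
logits = rng.normal(size=5)   # example 1D array of random values

probs_softmax = softmax(logits)
probs_sigmoid = sigmoid(logits)

print(probs_softmax, probs_softmax.sum())   # sums to 1
print(probs_sigmoid, probs_sigmoid.sum())   # generally does not sum to 1
```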

Recap of CNNs¶

image.png